
The Ackermann function(s)


Contents:

About the notation.
The Ackermann function.
Generalized arithmetic operations.

After obtaining my B.Sc. from Heriot-Watt University I was unemployed for about three years, from mid-1987 to about 1990.  (Except for my military service, every period of my life has been three years.)  This was originally due to my sponsoring company, where I'd intended to work, going for Chapter 11 (you may perhaps remember the scandal about silent submarine propellers for the Soviets, à la the later motion picture "Red October"; the firm that went bonkers was Kongsberg Våpenfabrikk).  Thus I had much free time to fill, and did a lot of reading, writing and programming.  When I read a letter to the editor in BYTE magazine, in which a Canadian lecturer wondered what the #&/"!?!! the Ackermann function actually does, stating that he'd posed the problem to his Computer Science students for years without any solution forthcoming, I got interested.  Not least because the same problem had been given by one of our lecturers when I was a student at Kongsberg College of Engineering (Norway), and nobody could find a clue.

The Ackermann function is a doubly recursive function which simply explodes when the argument values are increased from zero.  Thus it's a very good teaching device for Computer Science students learning about recursion: all their efforts at computing this seemingly simple function end with programs that hang or crash.  You can try out the hanging behavior using this Java applet.  Depending on the quality of your browser's Java, the browser may also crash.

Of course the good Ackermann himself (or herself?) must have known what the function is, and I later discovered (e.g. via an article in Scientific American) that these functions, of which the Ackermann function is a kind of common special case, are routinely used in connection with algorithm complexity analysis.  But I spent an evening calculating, spotting patterns and finding a solution, submitted the solution as a letter to BYTE, and felt pretty good about it when it appeared in print.  It turned out that to find that solution I had to use some ideas from a course in abstract data structures I attended at HWU, namely generalized arithmetic operations.  Also, I got a fresh new view of the old saying that "two and two is four", which turned out to be much more general than I'd thought!

 

About the notation.

The symbol "≡" marks a definition.  The notation "<>" is the "not equal" operator.  The symbol "·" is multiplication, and "^" is exponentiation (x^y is x raised to the power y).

 

The Ackermann function.

The Ackermann function is defined for natural numbers (non-negative integers) as follows:

 

Ack( 0, n )  ≡  n + 1.                                  for n >= 0
Ack( m, 0 )  ≡  Ack( m - 1, 1 ).                        for m > 0
Ack( m, n )  ≡  Ack( m - 1, Ack( m, n - 1 ) ).          for m > 0 and n > 0
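
Transcribed directly into code the definition is almost a one-liner.  Here is a minimal sketch in Java (the class and method names are mine, not from the applet mentioned above); being doubly recursive it will, exactly as described, hang or overflow the stack once m reaches 4 and n grows:

    final class Ackermann
    {
        // Direct transcription of the three defining equations.
        static long ack( long m, long n )
        {
            if ( m == 0 ) { return n + 1; }              // Ack( 0, n ) = n + 1
            if ( n == 0 ) { return ack( m - 1, 1 ); }    // Ack( m, 0 ) = Ack( m - 1, 1 )
            return ack( m - 1, ack( m, n - 1 ) );        // Ack( m, n ) = Ack( m - 1, Ack( m, n - 1 ) )
        }

        public static void main( String[] args )
        {
            System.out.println( ack( 3, 3 ) );           // Prints 61.
        }
    }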

 

For m = 0 the function is as trivial as it's possible to be, disregarding the identity function, namely the (Peano) successor function n + 1.  Going in the other direction, varying m while keeping n = 0, the behavior is a bit more interesting.  Computed with a simple Java applet, the first few values for m = 0, 1, 2 and so on are 1, 2, 3, 5, 13, and, uh, the applet's still, as you read this, computing the value for m = 5...

Obviously high values of m cause the function to head straight for some distant galaxy cluster.  Where a "high" value is, for example, 5.  However, as long as m is kept reasonably small, less than 4, n can be varied without the function exploding.  Here is a table:

 

         n = 0  n = 1  n = 2  n = 3  n = 4  n = 5  n = 6  n = 7  n = 8
m = 1        2      3      4      5      6      7      8      9     10
m = 2        3      5      7      9     11     13     15     17     19
m = 3        5     13     29     61    125    253    509   1021   2045

 

The power of today's personal computers means that current students do not have to struggle as we did: the bottom right part of the table, which once took some time to compute, now appears in an eyeblink, and there the pattern of the function becomes obvious.  The pattern for m = 3 is a near but not quite perfect doubling of the function value for each increment of n; that is, the function produces powers of 2.  And when one checks the values, they're exact powers of 2 offset by 3.

Thus, for m = 3 the following formula seems to hold (and in fact it does hold):

 

Ack( 3, n )  =  2^(n + 3) - 3
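
For example, for n = 3 the formula gives 2^(3 + 3) - 3 = 64 - 3 = 61, exactly the table entry for m = 3, n = 3.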

 

This gives some food for thought regarding m < 3.   Replacing exponentiation (the "power of" operation) by multiplication and addition, respectively, one obtains

 

Ack( 2, n )  =  2·(n + 3) - 3   =  2·n + 3

Ack( 1, n )  =  2 + (n + 3) - 3   =  n + 2
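
Both formulas agree with the table above: the m = 2 row reads 3, 5, 7, ..., 19, i.e. 2·n + 3, and the m = 1 row reads 2, 3, 4, ..., 10, i.e. n + 2.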

 

Now this looks suspiciously like a pattern, and indeed it is.   For m = 0 one goes below addition, to the Peano successor function, and for m = 4 one goes above exponentiation, to the tower function, so called because of the resulting tower of exponents when a value is written in terms of exponentiation.   Arithmetic operations above the tower function do not have individual names, as far as I know; instead these functions are just collectively referred to as Ackermann functions.

 

Generalized arithmetic operations.

Arithmetic operations (addition, multiplication, exponentiation, possibly including their inverse functions and functions above exponentiation) can be defined in many seemingly fundamentally different ways, along with corresponding definitions of natural numbers.  From the point of view of "interesting definitions", the most interesting one I know of is Church's definition of natural numbers as lambda functions (the lambda calculus is the underlying mathematical framework for Lisp), where addition is more complex than multiplication, which in turn is more complex than exponentiation, which in this scheme is the simplest arithmetic operation!  However, while interesting in the sense of elegance and mathematical surprise, it's not a very practical way to go here.

Instead let's start by defining natural numbers in terms of 0 and the successor function s.   Conceptually s( x ) = x + 1, but addition isn't defined yet: it will be defined in terms of s.  Thus, 1 = s( 0 ), 2 = s( 1 ) = s( s( 0 ) ), 3 = s( 2 ) = s( s( s( 0 ) ) ), and so on.

Addition of natural numbers can now be defined as

 

x + 0  ≡  x.

x + s( y )  ≡  s( x + y ).

 

which implies that x + 1 = x + s( 0 ) = s( x + 0 ) = s( x ). 
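
As a concrete illustration, here is the successor-based addition transcribed into Java (a sketch; the names s and add are mine).  The recursion counts y down to the base case, applying s once per step:

    final class SuccessorArithmetic
    {
        static long s( long x ) { return x + 1; }        // The Peano successor function.

        // x + 0  =  x;  x + s( y )  =  s( x + y ).
        static long add( long x, long y )
        {
            if ( y == 0 ) { return x; }
            return s( add( x, y - 1 ) );
        }

        public static void main( String[] args )
        {
            System.out.println( add( 2, 2 ) );           // Prints 4: "two and two is four".
        }
    }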

Similarly, multiplication can be defined in terms of addition,

x·0  ≡  0.
x·(y + 1)  ≡  x·y + x.

where the idea is that x·n should expand to x + x + x + ... + x, with n occurrences of x.
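
For example, x·3 = x·2 + x = (x·1 + x) + x = ((x·0 + x) + x) + x = x + x + x.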

Using the same idea (expansion to a series of applications of the underlying operation), exponentiation can be defined in terms of multiplication:

x^0  ≡  1.
x^(y + 1)  ≡  x^y·x.
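
Continuing in Java, multiplication and exponentiation stack on top of addition exactly as the definitions read (again a sketch with names of my choosing; add here simply uses Java's built-in +, standing in for the successor-based version above):

    final class BuildUp
    {
        static long add( long x, long y ) { return x + y; }

        // x·0  =  0;  x·(y + 1)  =  x·y + x.
        static long mul( long x, long y )
        {
            if ( y == 0 ) { return 0; }
            return add( mul( x, y - 1 ), x );
        }

        // x^0  =  1;  x^(y + 1)  =  x^y·x.
        static long pow( long x, long y )
        {
            if ( y == 0 ) { return 1; }
            return mul( pow( x, y - 1 ), x );
        }

        public static void main( String[] args )
        {
            System.out.println( pow( 2, 10 ) );          // Prints 1024.
        }
    }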

However, addition cannot be defined in terms of an underlying two-argument function in the same way:

Given

x + 2  =  x ? x.
x + (y + 1)  =  (x + y) ? x.

one would have

x + (0 + 1)  =  (x + 0) ? x.
x + 1  =  x ? x  =  x + 2.

which is a contradiction: the recursive equation forces x ? x = x + 1, while the base case demands that x ? x = x + 2.

Thus addition is somewhat a special case, not conforming to the general pattern of the definitions.

At this point two serious problems can be noted with the approach.  First, it requires pre-knowledge.  That is, although there is some similarity between the definitions, each new definition requires knowledge of the operation being defined, e.g. that x^0 = 1.  Second, the approach is not general: one cannot refer to operation number k, but only to individual operations such as addition and multiplication.

The main problem is the requirement of pre-knowledge (or analysis of each function to be defined), which can be dispensed with by using 2 as the base case (second argument value) instead of 1 or 0.  The operation result for argument values less than the base case will have to be deduced by requiring that general function properties hold for all, or at least most, argument values.  1 is not suitable as a base case because the meaning of "zero applications of the underlying operation" for x op 1 is not guaranteed to be simply x, although this will turn out to be the case for multiplication and above.

Thinking ahead to the tower function, left-associative definitions, as above, would yield a pretty lame tower function, where one would always be able to use simple exponentiation instead of the tower function.  To illustrate this, let's denote the tower function by the symbol "^^".  Using a left-associative definition one would have 5^^3 = (5^5)^5 = 5^(5·5), about 2.980·10^17, and in general a^^b = a^(a^(b - 1)).  On the other hand, using a right-associative definition one would have 5^^3 = 5^(5^5), about 1.911·10^2184.  Using the right-associative definition, tower function applications cannot in general be re-expressed as simple exponentiations, yielding a far more potent tower function (not only in the numeric values produced!).  For these reasons exponentiation is conventionally regarded as right-associative, the conventional tower function is the one obtained by a right-associative definition, and that's also the function involved in the Ackermann function for m = 4.

Finally, instead of using "y + 1" on the left-hand side and "y" on the right, let's use simply "y" on the left-hand side and "y - 1" on the right, which is a bit clearer (and incidentally corresponds better to the Ackermann function definition).

The definitions now look like this,

x + 0  ≡  x.
x + s( y )  ≡  s( x + y ).

x·2  ≡  x + x.
x·y  ≡  x + (x·(y - 1)).

x^2  ≡  x·x.
x^y  ≡  x·( x^(y - 1) ).

which, being of the exact same form from multiplication on, can be easily generalized:

 

x op_1 y  ≡  x + y.
x op_m 2  ≡  x op_(m-1) x.                              for m > 1
x op_m y  ≡  x op_(m-1) (x op_m (y - 1)).               for m > 1

 

where x·y = x op_2 y, and x^y = x op_3 y.
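
In code the subscript m simply becomes an ordinary parameter.  A sketch in Java (the name op is mine; as discussed above, for m > 1 the recursion only reaches the base case when y >= 2):

    final class Ops
    {
        // x op_m y, per the three defining equations.
        static long op( long m, long x, long y )
        {
            if ( m == 1 ) { return x + y; }              // x op_1 y  =  x + y
            if ( y == 2 ) { return op( m - 1, x, x ); }  // x op_m 2  =  x op_(m-1) x
            return op( m - 1, x, op( m, x, y - 1 ) );    // general recursive case
        }

        public static void main( String[] args )
        {
            System.out.println( op( 3, 2, 10 ) );        // 2 op_3 10 = 2^10; prints 1024.
        }
    }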

Hypothesizing that the Ackermann function can be expressed as

 

Ack( m, n )  =  (2 op_m (n + 3)) - 3

 

(for m > 0), it's convenient to also define

 

x op_0 y  ≡  y + 1.

 

to make the hypothesis valid also for m = 0, since the table of Ackermann values seems to indicate a simple successor function here.  As a (critical) bonus, with this definition the arithmetic operation recursive case x op_m y = x op_(m-1) (x op_m (y - 1)) becomes valid also for m = 1: x op_0 (x op_1 (y - 1)) = (x + (y - 1)) + 1 = x + y.
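
Before proving it, the hypothesis can be spot-checked mechanically.  A sketch (again with names of my choosing) that extends the op sketch above with the op_0 clause, plus the m = 1 base case x + 0 = x so the general recursion can bottom out, and compares against the direct Ack recursion over the range of the table:

    final class HypothesisCheck
    {
        // For m > 1 this is, as before, only valid for y >= 2.
        static long op( long m, long x, long y )
        {
            if ( m == 0 ) { return y + 1; }                      // x op_0 y  =  y + 1
            if ( m == 1 && y == 0 ) { return x; }                // x + 0  =  x
            if ( m > 1 && y == 2 ) { return op( m - 1, x, x ); }
            return op( m - 1, x, op( m, x, y - 1 ) );            // now valid for m = 1 too
        }

        static long ack( long m, long n )
        {
            if ( m == 0 ) { return n + 1; }
            if ( n == 0 ) { return ack( m - 1, 1 ); }
            return ack( m - 1, ack( m, n - 1 ) );
        }

        public static void main( String[] args )
        {
            for ( long m = 0; m <= 3; ++m )
            {
                for ( long n = 0; n <= 8; ++n )
                {
                    if ( ack( m, n ) != op( m, 2, n + 3 ) - 3 )
                    {
                        throw new AssertionError( "Mismatch at m=" + m + ", n=" + n );
                    }
                }
            }
            System.out.println( "Hypothesis holds for m <= 3, n <= 8." );
        }
    }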

The hypothesis can now be checked by simple substitution:

 

Ack( 0, n )  =  (2 op_0 (n + 3)) - 3                              by hypothesis
             =  n + 1.                                            by definition of op_0

Ack( m, 0 )  =  (2 op_m 3) - 3                                    by hypothesis
             =  (2 op_(m-1) (2 op_m 2)) - 3                       by recursive definition of op
             =  (2 op_(m-1) 4) - 3                                since "two and two is four"
             =  Ack( m - 1, 1 ).                                  by hypothesis

Ack( m, n )  =  (2 op_m (n + 3)) - 3                              by hypothesis
             =  (2 op_(m-1) (2 op_m (n + 2))) - 3                 by recursive definition of op
             =  (2 op_(m-1) (((2 op_m (n + 2)) - 3) + 3)) - 3
             =  Ack( m - 1, Ack( m, n - 1 ) ).                    by hypothesis, used twice

(Here "two and two is four" holds at every level: 2 op_m 2 = 2 op_(m-1) 2 = ... = 2 op_1 2 = 2 + 2 = 4.)

 

So the mysterious "+3" is seen as a device for reaching the value 4, which allows use of the fact that two and two is always four; and the equally mysterious "-3" is seen as a device for canceling the "+3" in the general recursive case.  Of course, reconstructing the function like this is not that big a deal.  The real mystery is how Ackermann could have arrived at the function in the first place.